Chaotic mixed excitation source for speech synthesis

Authors

Hemant A. Patil

Tanvina B. Patel

Abstract

Linear Prediction (LP) analysis has proven to be a very powerful and widely used method in speech analysis and synthesis. LP-based synthesis is carried out by exciting an all-pole model whose parameters are derived by LP analysis. Synthesis uses a mixed excitation source consisting of a sequence of impulses for voiced regions and a white-noise source for unvoiced regions. In this paper, we present a novel chaotic excitation source based on the chaotic titration method. The voiced and unvoiced regions in speech are modeled by chaos, which is quantified by adding noise of known standard deviation (determined using the chaotic titration method). On average, for synthesized voices (both male and female), MOS increases from 2 to 2.4, DMOS from 2.1 to 2.4, and preference in an A/B test increases from 39% to 61%. The PESQ score increases from 1 to 1.5 and the MCD score decreases from 4.06 to 4.03 for voices synthesized with the proposed chaotic mixed excitation source. The relatively better performance of the proposed approach may be due to the novel chaotic mixed source of excitation.
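The conventional mixed-excitation LP synthesis that the abstract takes as its baseline can be sketched as follows. This is only an illustrative sketch, not the authors' implementation and not the proposed chaotic source: the function names, frame length, pitch period, and the sign convention A(z) = 1 + sum_k a_k z^-k are all assumptions made for the example.

```python
import numpy as np


def mixed_excitation(voiced_flags, frame_len, pitch_period, rng):
    """Impulse train in voiced frames, white noise in unvoiced frames.

    All parameters are hypothetical; a real system would estimate the
    voicing decision and pitch period per frame from the speech signal.
    """
    excitation = np.zeros(len(voiced_flags) * frame_len)
    next_pulse = 0  # sample index of the next pitch pulse
    for i, voiced in enumerate(voiced_flags):
        start, end = i * frame_len, (i + 1) * frame_len
        if voiced:
            next_pulse = max(next_pulse, start)
            while next_pulse < end:
                excitation[next_pulse] = 1.0
                next_pulse += pitch_period
        else:
            excitation[start:end] = rng.standard_normal(frame_len)
    return excitation


def all_pole_filter(excitation, lp_coeffs):
    """Run the excitation through 1/A(z), assuming A(z) = 1 + sum_k a_k z^-k.

    Difference equation: s[n] = e[n] - sum_k a_k * s[n - k].
    """
    out = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = excitation[n]
        for k, a_k in enumerate(lp_coeffs, start=1):
            if n - k >= 0:
                acc -= a_k * out[n - k]
        out[n] = acc
    return out


# Usage: one voiced frame followed by one unvoiced frame, filtered by an
# assumed stable two-pole resonator (pole radius 0.9, angle 0.3 rad).
rng = np.random.default_rng(0)
excitation = mixed_excitation([True, False], frame_len=80, pitch_period=40, rng=rng)
radius, angle = 0.9, 0.3
lp_coeffs = [-2.0 * radius * np.cos(angle), radius ** 2]
speech = all_pole_filter(excitation, lp_coeffs)
```

In the approach the abstract describes, the impulse and white-noise sources would be replaced by a chaotic source whose noise level is set by chaotic titration; the abstract does not give enough detail to sketch that step here.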


Similar resources

A mixed-excitation frequency domain model for time-scale pitch-scale modification of speech

This paper presents a time-scale pitch-scale modification technique for concatenative speech synthesis. The method is based on a frequency domain source-filter model, where the source is modeled as a mixed excitation. This model is highly coupled with a compression scheme that results in compact acoustic inventories. When compared to the approach in the Whistler system using no mixed excitation,...


Towards an improved modeling of the glottal source in statistical parametric speech synthesis

This paper proposes the use of the Liljencrants-Fant model (LF-model) to represent the glottal source signal in HMM-based speech synthesis systems. These systems generally use a pulse train to model the periodicity of the excitation signal of voiced speech. However, this model produces a strong and uniform harmonic structure throughout the spectrum of the excitation which makes the synthetic spe...


NICT Blizzard Challenge 2010 Entry

This paper details a speech synthesis system developed at NICT for the Blizzard Challenge 2010. The system depends on an HMM-based speech synthesis technique that possesses two distinctive features: HMM training under global-variance constraint on the parameter trajectory and trainable mixed excitation for source-filter vocoding. For this year’s entry, we added some modifications to the system ...


Speech enhancement using voice source models

Autoregressive (AR) models have been shown to be effective models of the speech signal. However, although it is the most common model of speech, an AR process excited by white noise for speech enhancement fails to capture the effects of source excitation, especially the quasi-periodic nature of voiced speech. Speech synthesis researchers have long recognized this problem and have developed a variet...


Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT

A new control paradigm of source signals for high quality speech synthesis is introduced to handle a variety of speech quality, based on time-frequency analyses by the use of an instantaneous frequency and group delay. The proposed signal representation consists of a frequency domain aperiodicity measure and a time domain energy concentration measure to represent source attributes, which supplem...




Publication date: 2014
